When the Massachusetts Advisory Council on Education contracted with
Educational Research Corporation for a study aimed at documenting the
workings of "successful" inner-city schools in teaching children to read, the
complications of conducting such a study were only dimly comprehended. The
Council was aware that there would be definition problems with such words and
phrases as "inner-city" and "successful" schools and practical problems with
gaining access to "unsuccessful" schools to see how they contrasted with their
"successful" counterparts. However, as this technical report shows, these
problems were only the tip of the proverbial iceberg!

Hidden from view, when the study was launched over two years ago, were 
such problems as: 

-developing valid measures of poverty (so as to identify 
inner-city schools) and locating reliable data to support 
these measures. 

-making different measures of "success" comparable and deciding 
the basis for selecting "unsuccessful" (or contrast) schools. 

-selecting and developing operational definitions for the factors 
(presumably having a bearing on the "successful" teaching of 
reading) to investigate. 

-designing appropriate procedures and instruments to elicit valid 
and reliable data on the study schools. 

It is to the credit of the study team, led by Dr. Richard Willard, that 
they not only faced these problems frankly, honestly and patiently but also 
demonstrated a willingness to change and revise procedures and plans as the 
need arose. As a result, the reader will find the study's methodology and 
instrumentation clearly spelled out so as to allow judgments on the study's 
research adequacy to be made. To many of us associated with the study, the 
approach and procedures used are superior to any studies of this type yet 
undertaken in the country. 

One cautionary note. For readers to use the study instruments employed 
in this study without undertaking the changes and preparations advised by the
study team can only lead to invalid results and wasted effort. Nevertheless,
the Council hopes readers will find in the report (and the report summary)
enough ideas and suggestions on how schools might proceed to warrant moving in
the directions recommended by the study team. MACE would certainly want to 
hear from readers who decide to move in such directions. 

Allan S. Hartman 
Associate Director 
Advisory Council on Education 

Looking to research for guidance in how to teach children to read
has proven fruitless for many who are concerned about reading. Some 
research, in fact, suggests that there is hardly anything that schools 
can do, since home influences are so dominant in learning to read. The 
study directed by James Coleman for the U.S. Office of Education concluded 
that schools have little influence separable from the backgrounds of 
their students. A similar study conducted by the International Association 
for the Evaluation of Educational Achievement (IEA) found very little
evidence of the impact of schools on reading. Many researchers have 
concluded therefore that schools do not make a difference. 

There are other researchers, though, who conclude that schools do 
make a difference. Some have used survey techniques to identify school 
factors associated with reading performance. 

Guthrie, in providing a summary of several such studies, noted that 
there is little doubt that schools are important, but the survey studies 
still do not suggest what schools must do to succeed. A different 
approach from the use of a survey was taken by George Weber and described
in his monograph Inner-City Children Can Be Taught to Read: Four Successful
Schools. Weber searched for and found four city elementary schools
whose students were at the national norms in reading. After visiting 
each school Weber identified eight factors that the schools shared and 
that seemed to explain their success: strong leadership, high expectations, 
good atmosphere, strong emphasis on reading, additional reading personnel, 
use of phonics, individualization, and careful evaluation of pupil 
progress. 

The approach used by Weber to identify successful schools and to
look at them was recognized by the Massachusetts Advisory Council on
Education (MACE) to be an appropriate approach to be used in the Commonwealth
to identify what factors are associated with reading success in Massachusetts 
city schools. MACE commissioned Educational Research Corporation (ERC) 
to search for successful schools in Massachusetts cities and to identify 
what these schools did to become successful. 

Identifying the Study Schools 

By using the 1973 allocations of funds from the Federal ESEA Title 
I program the study staff identified the ten Massachusetts cities with 
the largest allocations and thereby the largest numbers of poverty 
students. Ultimately nine of these cities agreed to participate in the 
study. (Note 1) 

ERC gathered poverty data (which consisted of (1) the proportion
of children in an attendance area who according to Title I applications 
were from low income families and (2) the proportion of the attending 
students who were recipients of free lunch or free milk) on each elementary 
school in the nine cities. These two measures of poverty are of course 
fallible. It is the case, for example, that cities differ in the ways 
they compute the proportion of children from low income families, some 
using census data, some using welfare rolls, and some a combination of 
both, and that the identification of students eligible for free lunch or 
free milk programs is frequently subject to misreporting of income by 
parents and the judgment by the principals of those in need but otherwise 
not identified. The combination of these two measures, however, is 
still better than either is alone, and so serves as a generally valid 
measure of poverty in the schools. Each measure of poverty was used to
rank the elementary schools in each city, and the average of the two
rank values was used as the final ranking of the schools from high to 
low poverty. This ranking was designed to identify poverty schools by 
means of data that are consistent across cities, but evidence collected 
later in the study showed that the poverty classifications were not as 
uniform as they appeared to be at first and that the identification of 
poverty is itself a complex matter. 
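
The rank-averaging step described above can be sketched as follows. This is only an illustration under stated assumptions, not the study staff's actual procedure: the school names and proportions are hypothetical, the function names are invented here, and giving tied schools a shared average rank is an assumption the report does not address.

```python
def rank_descending(values):
    """Return 1-based ranks (1 = highest value); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average of the tied positions
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def poverty_ranking(schools):
    """schools maps name -> (Title I proportion, free lunch/milk proportion).

    Returns the names ordered from high to low poverty, plus each school's
    averaged rank value.
    """
    names = list(schools)
    title1 = rank_descending([schools[n][0] for n in names])
    lunch = rank_descending([schools[n][1] for n in names])
    combined = {n: (t + l) / 2 for n, t, l in zip(names, title1, lunch)}
    return sorted(names, key=lambda n: combined[n]), combined


# Hypothetical figures for the schools of one city, for illustration only.
city = {
    "School A": (0.62, 0.78),
    "School B": (0.35, 0.55),
    "School C": (0.48, 0.60),
    "School D": (0.80, 0.75),
    "School E": (0.70, 0.58),
}
ordered, combined = poverty_ranking(city)
print(ordered)
```

Because each school is ranked within its own city, a school's position depends on how that city computed its proportions, which is one way the classifications could turn out less uniform across cities than they first appear.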

Average test scores in reading were added to the ranked list of 
schools in order to identify those poverty schools whose students were 
reading at or above grade level according to standardized tests. These 
test scores were from existing records of city-wide testing programs 
rather than a special administration of tests since the Study Review 
Committee strongly urged that poverty children not be subjected to any 
more tests. They are already tested extensively, and one more test 
would prove both burdensome and redundant. 

The test results were derived from the scores of sixth graders
because sixth grade test results draw upon the cumulative learning of 
reading throughout the elementary school years, and a student scoring at 
grade level in his last grade in elementary school has probably been 
progressing successfully through most of his earlier years in school. 
Students may experience different sequences of reading instruction or 
different approaches to reading that might result in some early deviation 
from the norm, but if they converge later upon the norm in the sixth 
grade the school should still be judged to be successful. 

ERC focused the inspection of sixth grade scores among the poverty
schools where the general performance was low. In the nine cities there 
were, nevertheless, ten poverty schools that had students performing at 
the national norms or better, and the schools therefore stood out from 
the rest. These ten schools represented a cross-section of city schools. 
They were generally situated close to the center of the cities, which 
were of somewhat different sizes, with school populations varying from 
approximately 10,000 to 100,000 students. Most of these successful 
schools were in old, traditional buildings with self-contained classrooms, 
but some were in newer buildings with open spaces. Most were neighborhood 
schools, but some had students who were bused. Varied as they were, the 
ten schools constituted a target group to be studied in detail to uncover 
what characteristics might have helped to set them apart on test scores. 

Identifying Contrast Schools 

Identifying characteristics of successful schools, no matter how 
carefully done, does not necessarily explain what those schools did to 
become successful. To find what they did to become successful would 
require a study of schools over an extended period of time as they 
passed from a failing status to a successful one. Without the advantage 
of such a study over time it is critical to identify which characteristics 
of successful schools are not also shared by unsuccessful schools. 
These characteristics might explain what the schools did to become
successful because they are different. 

In order to highlight factors on which the successful schools 
differ the staff identified a set of contrast schools which were not 
successful. Schools can differ in many ways, some of which are external 
to what the school does, so the contrast schools were selected deliberately 
to be similar to the successful schools in terms of the poverty levels
of their children. For each successful school another school was found
that matched on the two poverty measures, Title I and free lunch percentages,
and was similar also in racial composition and the proportion of bilingual
students. Comparable to the successful schools on measures of poverty
as well as racial and bilingual composition, the contrast schools had
reading scores that were, on the average, 1.3 grade equivalents below
national norms and thereby quite below the successful schools. (Note 2)

Studying the Schools 

Since the two sets of schools differed on test scores but matched 
on a number of external measures, some other factors were needed to
explain how they differed in the performance of their students. 

Possible factors for study were the sets of variables that had been 
identified in one or another of the surveys, such as those of Coleman, Guthrie, and
others, but those variables were not consistent correlates of student
success. Further, those variables tend to be quantitative, as, per 
pupil expenditure, class sizes, and teacher aptitude, and they fall 
short of the qualitative dimensions that better describe what actually 
occurs in schools and that better describe the direct influences upon 
students. The studies by George Weber and others identified variables 
that go beyond the quantitative to include qualitative dimensions of 
schools, and these were judged to provide a richer source of factors to
be studied. 

The project staff drew upon the eight factors identified by George 
Weber to provide the labels for factors to be studied in this project, 
but it was deemed important to define the factors with a strong emphasis 
upon the processes involved. The Weber factors were richer than simple 
quantitative dimensions because they dealt with school processes, but
the project staff elected to stress procedures even more. Thus, for
example, the staff went beyond the identification of additional reading
personnel, a Weber factor, to the consideration of what roles and functions
are performed by the reading personnel. The Weber factors were redefined
in operational terms; some were altered; and some were split up. It is
not important here to identify where and how changes were made because
the factors studied can be described in sufficient detail that they
stand alone for scrutiny. Below are given the descriptions of the
factors used in this study.

A. Leadership

When there is strong leadership in reading instruction in a school,
the staff will agree unanimously on who provides the leadership--whether
it be the principal, a school reading specialist or someone from the
central office--and observers will readily detect who provides the
direction to the teaching of reading. A leader will display inspiration,
empathy, and flexibility in providing the staff encouragement to do its
very best.

B. Coordination

Good coordination in the teaching of reading means that students 
experience across and within grade levels activities that reinforce each 
other. Work at any grade is related to work in previous grades, and the
several supplementary reading services, remedial, Title I, or learning
disability, which take place outside the classroom are still related to
what transpires inside the classroom. Coordination can be achieved
simply by standardizing all reading activities, but a school with many 
varied learning activities must see that the activities are well "orchestrated". 

C. Additional Reading Personnel

Additional reading personnel, that is other than regular classroom teachers,
become significant when there is a variety of personnel: school reading
specialists; Title I personnel; learning disability specialists; aides and
others to work with students singly or in small groups. Even the school
librarian can be an additional resource for reading. Good use of personnel means
also that reading specialists share their expertise with teachers by being a
resource for reading instruction in classrooms.

D. Atmosphere

Good atmosphere in a school means that people in the school are
relaxed and without tension, that the school is operating in an orderly
fashion, that the students are purposeful in their activities, and that
any noise is normal with no evidence of disruptive clatter.

E. Individualization

A truly individualized program responds to individual differences
in background, learning styles, and rates of learning by means of diagnostic
procedures and prescriptions that differ by individual either in curricula
materials used or in study times required or in a combination of both.

F. Evaluation

Sound evaluation of pupil progress employs several measurement 
techniques, as, teacher constructed tests, curriculum tests, criterion-referenced
tests, standardized tests, and so on. The evaluations are
most effective when reading progress records follow the student from
grade to grade and from teacher to teacher and when they are used in 
developing instruction strategies. 





G. High Expectations

When the staff of a school has high expectations for students in
general and for the students in that school in particular, high standards
will be set for the students, and encouragement will be given for good
performance.

H. Strong Emphasis on Reading

A school that places a strong emphasis upon reading devotes ample
time to reading instruction and makes available a number and variety of
reading materials. A priority for reading usually means that reading is
taught at the beginning of the day when children are most alert.

I. Use of Phonics

A strong use of phonics in reading instruction means that decoding
skills are developed in the early grades and that phonics is an integral
part of the curriculum as represented in the instructional materials
used, basal or supplementary.

J. Staff Training and Experience

A well trained and experienced staff will have had extensive formal
education including graduate study and special courses in reading instruction
either in college or in an in-service program and will have been working
in education professionally for some time. There will be evidence that
the staff stays contemporary.

K. Quality of Teaching

A good teacher will manage a classroom well such that students will
be productive during study time and will be supportive of individual
students particularly as they make mistakes or falter.

Visiting the Schools

The framework of factors to be studied made it clear to ERC that 
there should be many methods of collecting data, and, particularly, that 
interviews and direct observations should be made during visits to the
schools. School visits can produce reports that are affected by reporter
bias, so ERC ensured that no visitor would know whether the school was
successful or not and that the visitor could not produce reports of 
observations that were biased by any personal expectations about factors 
of success. ERC further decided to use a team of visitors to each 
school so as not to be dependent upon a single visitor's perceptions. 
Each team of visitors had three members, some of whom visited twice in
order to provide a total of five visits to each school. One member of
the team who visited twice was a reading expert selected to have general 
knowledge of the principles of reading instruction and, as well, familiarity 
with schools and how reading is actually taught. Thus, every school was 
visited by a reading expert. Two visits were made by a research associate 
from the staff of ERC. The fifth visit to each school was usually made 
by the assistant project director to give continuity across the different 
visiting teams. The five visits to a school were planned on different 
days to provide experiences that were as different as possible and to 
reduce the effects of special occasions, as, for example, field trips 
and assemblies, that occur in schools. 

Five classrooms were selected randomly by ERC, and each was visited 
on two different occasions by a different observer. This produced a 
total of ten reading periods observed in each school. The five classrooms 
were selected to have one from each of the grades one through five.

Collecting Data

ERC used observation schedules in the classrooms that were designed 
to collect data relevant to the study factors. Simple data included the 
number of students and number of aides present. Notes were made about 
the evidence of a reading area in the room and about the availability 
and accessibility of various reading materials. Rankings and notes were 
made about the classroom atmosphere and about the interactions between 
teachers and students. On four different occasions during the reading 
period the observer systematically checked on each child in the room and 
indicated on a form what sort of activity the child was engaged in. 
When tallied, these observations showed, for example, how many children 
were involved in reading instruction with a teacher, with an aide, with
other students, or alone. The tallies also showed when work was being
done on non-reading activities. The combination of all the counts provided
a Learner Activity Index showing how student time was distributed over 
the several activities. 
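
A tally of this kind can be sketched in a few lines. The report does not give a formula for the Learner Activity Index, so the version below rests on an assumption: the index is taken to be the share of all child-observations that fall in each activity category. The category codes, the function name, and the sample tallies are hypothetical.

```python
from collections import Counter


def learner_activity_index(sweeps):
    """Share of all child-observations falling in each activity category.

    sweeps is a list of observation sweeps; each sweep holds one activity
    code per child present in the room at that moment.
    """
    counts = Counter(code for sweep in sweeps for code in sweep)
    total = sum(counts.values())
    return {code: counts[code] / total for code in sorted(counts)}


# Hypothetical record of one reading period: four sweeps, five children.
sweeps = [
    ["teacher", "teacher", "alone", "alone", "non_academic"],
    ["teacher", "aide", "alone", "alone", "logistics"],
    ["teacher", "teacher", "aide", "alone", "alone"],
    ["students", "students", "alone", "teacher", "non_reading"],
]
index = learner_activity_index(sweeps)
for code, share in index.items():
    print(f"{code:13s} {share:.2f}")
```

Pooling the four sweeps treats each check on each child as one unit of student time, which is what lets the shares be read as a distribution of time over activities.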

In addition to observing classrooms, the visitors conducted structured 
interviews. Each of the five teachers who had been observed was interviewed 
after one of the observations, and the interview protocol provided 
several questions, particularly questions that related to leadership, 
coordination, and individualization. The reading expert interviewed the 
school reading specialist on one day and on another day the school 
librarian, whenever there was one. ERC staff visitors interviewed the 
building principal as well as the person or persons in the school district 
central office most responsible for the reading program in the school 
system. 

Even though protocols were structured for interviews and observations,
the visitors were encouraged to use additional space on the forms to 
describe any findings that were worthy of note. These non-structured 
comments revealed, for example, that in one school the staff expressed 
the feeling that the students were protagonists who must be controlled 
and that the staff was compelled to constantly monitor movement in the 
hallways. 

The observations and interviews provided rich descriptions of the
schools, and the study staff collected statistical data as well so that 
the combination of clinical and statistical data would give in the 
aggregate richer information than either source alone. The study staff 
prepared a number of questionnaires that allowed respondents to provide 
additional data about the multiplicity of factors being studied. The 
principal, the school reading specialists, and the teachers who had not 
been interviewed completed questionnaires that provided data about 
themselves as, for example, their training and experience and about the 
school as they perceived it. 

Quite often the same data were requested of several respondents 
which allowed comparisons among responses that came from different 
sources. Differences between responses showed when there were different 
perceptions regardless of the actual facts. 

There were several other forms that provided statistical data. On 
one, the teachers of grades one through three provided for each of three 
days a time log showing the distribution of teacher time devoted to 
different activities from which ERC could infer the amount of teacher 
time devoted to reading instruction. A second form elicited from the
principal certain data about the school as, for example, racial mix, the
extent of bilingualism, student mobility, and the like, to verify the
data that had been used to match the pairs of schools to make them 
comparable. A third form, a survey of knowledge about classical children's 
literature adapted from a standard test of this sort prepared by Charlotte 
Huck, was given to the teachers of grades four through six anonymously. 
The teacher scores were partial evidence for the training and experience 
factor. 

Two survey instruments were administered to sixth grade students. 
One randomly selected group completed an inventory composed of forty- 
five questions which were designed to uncover not only positive attitudes 
toward reading but also indications whether or not reading was a preferred 
activity. Another group of sixth graders completed questionnaires about
home backgrounds, indicating the presence or absence of appliances in
the home, the educational levels achieved by parents, and information 
about parental aspirations and support of the child as a student. Such
data served to verify that the backgrounds of the students were
similar in the two sets of schools. ERC recognized that while the
students were not the most accurate source of the data, people still do
act according to what they believe to be true even if that differs from
what is actually true.

Instrumentation for Data Collection 

Data related to each of the factors were clearly collected from 
many different sources. Table 1 provides a summary of how many items on 
various interviews, questionnaires, or other instruments were used to 
collect data on each factor. To illustrate how to interpret the summaries, 
consider leadership. The table shows that one question in each teacher 
interview dealt with leadership data as well as three questions on
principal and teacher questionnaires and two on the questionnaire completed 
by reading specialists. 

The instruments naturally differed from one another; open-ended
questions appeared most frequently in interview protocols so that
probes could be made to clarify and expand responses.
Following is a listing of the questions used in the Central Office 
Interview protocol as an illustration. The actual form provided space 
for interviewers to enter responses, but this listing shows the questions 
and the foreseen probes. 

CENTRAL OFFICE INTERVIEW 

1. Do you have specific goals and objectives related to the reading 
program in your schools? 

YES NO 



Are they in writing? 

YES NO (IF YES) May I please have a copy? 

2. How are revisions made to the goals and objectives? 

3. Does the school system have a curriculum guide in reading? 

YES NO 

(If YES) May I please see or borrow a copy? 

(If YES) When was the curriculum last revised? 

(If YES) Who is responsible for revising and updating the guide? 

4. Are the teachers given specific checkpoints (e.g., pages, books) that 
they should read by given dates? 

YES NO 



5. Are there particular features of your reading program that you think
others might find beneficial? 

YES NO 

6. Do you find any weaknesses in your reading program?

YES NO 

(If YES) What? 

7. Are there people in the schools besides you and classroom teachers who 
assist students with reading? Please describe. 

YES NO 

(If YES) Who determines their function? 

8. Is there a reading diagnostic program in the school? 

YES NO 

(If YES) How does it operate? 

Are all children involved? 

How many specialists are involved? 

Are special prescriptions made? 

9. Do most of the teachers in your schools individualize the reading program
for each child? 

YES NO 

(If YES) How? 

10. What records are kept on reading performance of the students? 
If possible, may I please have samples of the record sheets? 

11. Please describe any in-service training programs in reading conducted 
for teachers.

Who conducts the program? 

Are you involved in the program? In what way? 

12. Do you draw upon resources outside the schools (consultants, colleges, 
institutions) to help in the reading program? 

YES NO 

(If YES) Please describe. 

Questionnaires contained mostly questions with straightforward
answers which required no amplification by respondents and, as such,
were not unlike those used in other survey studies that relied entirely
upon questionnaire responses. This study further relied upon observation
instruments to provide other data. One such instrument was used to
record four times during a reading class the numbers of students participating 
in different forms of activities according to the following outline of 
activities: 

I. Academic

    A. Skill learning

        1. Reading

            a. Alone

            b. With others

                1. Teacher

                2. Aide

                3. Student(s)

                4. Combination of above

        2. Non-reading (art, music, directed play)

    B. Logistics (preparation for lesson)

II. Non-Academic (being disciplined, inactive, eating)

Finally, some of the instruments not only provided for but encouraged 
the observers to make comments about what they saw that fit no given 
structure. These comments provided clinical data--essentially anthropological 
data--about classrooms observed, the staff, and the school in general. (Note 3) 

Analyzing the Data

The data collected about the twenty schools came from many different 
sources and were in different forms ranging from "hard" data, such as numbers,
to "soft" data, such as anecdotes. Numerical and categorical hard data
could easily have been summarized statistically for comparison purposes
while descriptive and anecdotal soft data were more amenable to clinical
analysis. The study staff elected not to conduct statistical and clinical 
analyses separately from each other, but instead to combine the two in 
order that the analysis would have the strengths of each combined. To 
combine the two approaches, the staff used collective judgments. These 
collective judgments were made factor by factor following a review of 
all the data collected about the schools by a team of at least four 
people, including staff who had visited the school because they had 
first hand experiences important to the judgment making. 

A summary of all the data collected and arranged according to the 
study factors was prepared by one of the staff who had visited the 
school, and then was presented to each member of the group responsible 
for making the judgments. Following independent reading of the clinical 
and statistical data, the group assembled and discussed the data about 
each factor searching for clarification whenever the data were conflicting; 
frequently those who had visited the school were able to explain or
amplify upon the data. Following the discussion each member of the
group made a rating that indicated his or her judgment about whether the
data about the study factor suggested that the school was either a
successful or a contrast school. These ratings were made without knowing
yet whether the school was a successful or a contrast school.

A rating scale of five points was designed to express the likelihood
that the school was either a successful school or a contrast school.
Following is a description of each scale value:

Scale Value    Description

     5         On this factor alone this school definitely
               appears to be a successful school.

     4         On this factor alone this school probably
               is a successful school.

     3         On this factor alone this school could be
               either a successful or a contrast school.

     2         On this factor alone this school probably
               is a contrast school.

     1         On this factor alone this school definitely
               appears to be a contrast school.



When the ratings were all the same, that unanimous rating became
the consensus rating. When the ratings differed, however, consensus was 
not achieved by simply averaging the ratings. Rather, further discussions 
served to clarify the data and new ballots were made. On a new balloting 
individual ratings were often changed. The cyclic process of balloting 
and discussing was repeated until there was unanimity by the group. 

The group process not only resulted in unanimity but resulted in
finer distinctions than originally planned when the groups introduced 
pluses and minuses to the five point scale to reflect their finer judgments. 
Beginning with a five point scale, the addition of pluses and minuses 
ultimately resulted in a rating scale of thirteen points. (Note 4) 

The consensus rating approach was used to obtain ratings based upon 
the clinical and statistical data collected for each factor except the 
Use of Phonics factor. That factor was not based upon clinical data to
the same extent as other factors, so a different approach was used. ERC
had one of the reading experts inspect the different basals and supplementary
materials that each teacher reported using and classify them according
to the extent to which they relied upon phonics and decoding. The staff
then computed for each school the average of the expert's ratings, which
were based upon the following scale: 



Rating Description 

5 Mainly phonics. 

4 Phonics mixed with other skills. 

3 Some phonics. 

2 Partially phonics. 

1 No phonics. 



Analysis of Ratings 

In Table 2 are given the ratings for the twenty schools by each
study factor. The ratings for the successful schools are grouped together,
and the ratings for the contrast schools are also grouped together.

One way to analyze the ratings for the two groups of schools is to
compute means for each group and compare them. Table 3 shows the group
means computed by assuming that a plus (+) is equivalent to adding one-third
of a rating point and that a minus (-) is equivalent to subtracting
one-third for the group consensus ratings.
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TAiLE 3 

Mean Ratings of Groves of Schools 



Factors 


Successful 
Schools 


Contrast 
Schools 


A. Leadership 


2.47 


2.60 


B. Coordination 


2.33 


2.&0 


C. Additional heading 
Personnel 


2.67 


3.03 


0. Atmosphere 


3.00 


3.17 


E. Individualizati(m 


2.37 


2.33 


F. Evaluation 


2.43 


2.53 


G. Expectation 


2.17 


2.13 


H. Strong Ei^ihasis 


3.23 


3.27 


I. Use of Phonics 
iasal 

Supplement ary 


2.79 
3.45 


2.90 
3.57 


J. Training and Experience 


2.90 


2..-' 


Quality of Teaching 


2.77 


3.27 



None of the mean ratings of the successful schools is significantly 
different statistically from the corresponding mean for the contrast 
schools, but the total number of schools is so small that such use of 
statistical tests lacks sufficient power to detect other than really 
large differences between the two groups. 



Even though the means taken factor by factor show no significant 
differences, as a group there is some suggestion that there is an overall 
difference, oddly enough in favor of the contrast schools. Among twelve 
possible comparisons of means that can be made from the data in Table 3 
there are nine comparisons in which the contrast means exceed the successful 
means. This number of differences in favor of one group comes close to 
being significant (probability of .07) and suggests that a comparison 
between the groups based upon all factors taken together rather than 
singly might produce significance. 
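The probability of .07 quoted above is what a one-tailed sign test gives for nine or more differences in one direction out of twelve. The report does not name the test used, so the sketch below is offered only as an assumption that happens to reproduce the figure; the function name is hypothetical, and the calculation treats the twelve comparisons as independent, which the factors probably are not.

```python
from math import comb


def sign_test_upper_tail(successes: int, trials: int) -> float:
    """P(X >= successes) when X ~ Binomial(trials, 1/2)."""
    favorable = sum(comb(trials, k) for k in range(successes, trials + 1))
    return favorable / 2 ** trials


# Nine of the twelve comparisons favor the contrast schools.
p = sign_test_upper_tail(9, 12)
print(round(p, 3))  # 0.073
```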

To investigate the collective effects of the various factors the 
study staff used a more complex analysis procedure, the multivariate 
statistical technique of Discriminant Analysis, which considers the 
means and interrelationships of the factors taken together in a linear 
combination. This analysis, however, produced no evidence of differences 
between the two groups of schools. Since the linear model does not 
include interactions between factors, the staff made an effort to introduce 
product terms into the discriminant analysis so that main effects and 
interaction effects could be assessed together. Twenty schools provide 
too few degrees of freedom to include all possible product terms, so 
terms were introduced only if there was a suggestion that there was some 
interaction from inspection of the scores. The analysis of interactions 
by discriminant techniques suggested that High Expectations interacted 
with a combination of Individualization and Evaluation in such a way 
that the absence of High Expectation essentially discounted any effects 
of Individualization and Evaluation. This interaction effect was not 
significant--just as was the case with the comparison of means, the 
number of schools is quite small--but it suggests the hypothesis that 
the Individualization factor and the Evaluation factor that should 
accompany it do not produce positive effects unless the staff agrees 
that the students are capable of learning. If such a hypothesis is 
true, it may mean in operational terms that a staff with low expectations 
fails to diagnose weaknesses and prescribe assignments properly. The 
study provides no real evidence for this or for the hypothesis, however, 
because no data were found to be convincingly supportive. 

Another form of analysis provides for comparisons between the two 
groups of schools on all the numerical and categorical data collected on 
the schools. Such comparisons naturally exclude the clinical data 
obtained, but nevertheless they are possible sources of meaningful 
differences between the two groups of schools. There were ninety-four 
data items for which it was possible to test comparisons between the 
two groups of schools for statistical significance. With this many 
items, about five are expected to produce differences apparently significant 
at the five percent level by chance alone, and one difference is expected 
by chance to be significant at the one percent level. In fact, there 
were fewer than five differences at the five percent level and none at 
the one percent level. These comparisons therefore failed to establish 
differences between the groups of schools just as all the other analyses 
had failed. 
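The chance expectations cited in this paragraph follow from multiplying the number of items by the significance level. The sketch below is a minimal illustration under the assumption that the ninety-four items are independent, which is unlikely to hold exactly for data drawn from the same twenty schools; the function names are hypothetical.

```python
from math import comb


def expected_chance_hits(n_items: int, alpha: float) -> float:
    """Expected number of items significant at level alpha by chance alone."""
    return n_items * alpha


def prob_at_most(k: int, n_items: int, alpha: float) -> float:
    """P(at most k chance-significant items), assuming independent items."""
    return sum(
        comb(n_items, i) * alpha ** i * (1 - alpha) ** (n_items - i)
        for i in range(k + 1)
    )


print(expected_chance_hits(94, 0.05))  # about 4.7, the "about five" of the text
print(expected_chance_hits(94, 0.01))  # about 0.94, the "one" of the text
# Chance of seeing fewer than five items at the five percent level.
print(prob_at_most(4, 94, 0.05))
```

Under these assumptions, finding fewer than five differences at the five percent level and none at the one percent level is an ordinary outcome of chance.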

The study staff reviewed other data about the schools to see if 
there were possible explanations other than the study factors for the 
differences between the schools. For example, a review of new data 
collected on the general compositions of the student bodies did show 
that one pair of matched schools did not really match well on the proportions 
of bilingual students because the proportion had increased in the contrast 
school. In another pair the socioeconomic data had not provided a good 
match. In that pair the successful school was found to have, in fact, 
a relatively high number of low income families, but the incomes had a 
bimodal distribution resulting in a relatively high number of middle 
class families as well. In some four other pairs there were variations 
in student mobility that had not been known earlier because some data 
were missing. In each instance the contrast school had a more mobile 
student body, and that fact was a possible explanation of score differences. 
Students transferring into a school obviously bring with them experiences 
from other schools which may or may not have helped them to read and so 
may depress or elevate a school's total performance, and conversely a 
school that loses its students to other schools has its effects on 
reading dissipated. As a result of this analysis of other data from the 
twenty schools, some evidence of contaminating external effects existed, 
but again there was not sufficient evidence to explain all the differences 
in test scores. 

Thus far, the analyses used, however complex, failed to uncover 
consistent differences between the ratings of the two groups of schools. 
It does not follow, however, that all the ratings of the twenty schools 
are alike. This can be verified by direct inspection of the ratings 
given in Table 2, perhaps the simplest form of analysis. This inspection 
shows, for example, that the successful schools have a very mixed set of 
ratings. Some ratings of the successful schools are high, a fact which 
indicates that there is evidence of the presence of the study factors in 
the successful schools, but that evidence is scattered across the schools 
and across the factors. Ratings of 4, including 4- and 4+ as well as 4, 
are easily seen to occur in several rows and in several columns, and no 
consistent pattern is apparent either by schools or by factors. In 
fact, schools that have high ratings on some factors also have low 
ratings on others. This means that they appear to be successful schools 
according to some factors, and at the same time they appear not to be 
successful schools according to other factors. 

Among the successful schools there are some with ratings so low 
that they appear not even to be successful schools according to all or 
almost all of the factors. Successful school #1, for example, has ratings 
that uniformly are interpreted as meaning it looks not at all like a 
successful school. Successful school #10 has ratings that are almost 
consistently low but not so extreme. Thus, some successful schools 
appear as though they should be contrast schools instead. 

Not only do some successful schools appear instead to be contrast 
schools, but some contrast schools appear more like successful schools. 
Contrast school #11 has ratings that taken together suggest it is a 
successful school. In fact, contrast school #11 looks more like a successful 
school than do successful schools #1, #5, and #10. 

Thus, in addition to the original two groups of schools the study 
has identified two other groups. One includes successful schools that 
look more like contrast schools. These are schools with students achieving 
national norms on tests, but otherwise these schools employ practices 
and procedures not judged to be different from those found in mediocre 
schools. The other new group differs because its procedures and practices 
were judged to be above the normal experience, but the students have low 
test scores. These two new groups seem in some way to be exceptions to 
some rule or rules. 

The four groups are mutually exclusive, but there are underlying 
connections between the groups that are illustrated in Table 4. 

Table 4 

                              Test Scores 
Processes         positive                negative 

positive          1. True Positive        2. False Positive 
negative          3. False Negative       4. True Negative 

The rows of Table 4 separate schools according to whether an inspection 
of their procedures and practices leads to a positive or a negative 
judgment. The columns classify schools according to test scores, which 
are some measure of the outcomes of the school operations. In cell #1 
are schools with positive judgments of processes and with good test 
scores. These are labeled True Positive schools. In cell #4 are schools 
judged low on processes and low on test scores, and they are called True 
Negative schools. Cell #2 contains schools with positive approaches but 
with low test scores, so they are labeled False Positive schools. 
Finally, in cell #3 are the schools with weak-appearing practices and 
procedures but nevertheless good test scores, and they are called False 
Negative schools. 

The twenty study schools had already been classified by the columns 
when they were selected, and the staff later classified them by rows in 
order to identify which schools fell in which cells. Schools were 
considered positive on processes if four or more of their ratings were 
above three and thereby in the range for which the judgment was that the 
school was a successful school. The negative schools had none, one, or 
two ratings above three, so their ratings were predominantly low and 
they thus had been judged to be not successful. Two of the twenty 
schools had three good ratings, not enough to be judged positive but 
too many to be negative, and those two were excluded. Table 5 shows how 
many study schools fell in each of the cells. 
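The classification rule just described can be sketched in a few lines. This is a minimal illustration and not the staff's own procedure: the function names and the sample ratings are assumptions, and "above three" is read here as any converted rating numerically greater than 3, so that a 3+ counts and a 3 or 3- does not. Whether the Use of Phonics ratings entered the count is not stated, so the sketch simply takes a list of consensus ratings.

```python
from typing import List, Optional


def to_number(rating: str) -> float:
    """Convert '3', '3+' or '3-' to a number (plus/minus = one-third point)."""
    rating = rating.strip()
    if rating.endswith("+"):
        return int(rating[:-1]) + 1 / 3
    if rating.endswith("-"):
        return int(rating[:-1]) - 1 / 3
    return float(int(rating))


def classify_processes(ratings: List[str]) -> Optional[str]:
    """'positive' for four or more ratings above three, 'negative' for
    two or fewer, None for exactly three (such schools were excluded)."""
    high = sum(1 for r in ratings if to_number(r) > 3)
    if high >= 4:
        return "positive"
    if high <= 2:
        return "negative"
    return None


def classify_cell(good_test_scores: bool, ratings: List[str]) -> Optional[str]:
    """Place a school in one of the four cells of Table 4."""
    processes = classify_processes(ratings)
    if processes is None:
        return None
    if processes == "positive":
        return "True Positive" if good_test_scores else "False Positive"
    return "False Negative" if good_test_scores else "True Negative"


# Illustrative ratings only.
print(classify_cell(True, ["4", "4", "2", "4", "2+", "3-", "3+", "4"]))
```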

Table 5 

                      Test Scores 
Processes         positive    negative 

positive             4           5 
negative             6           3 



The existence of four groups makes possible different analyses of 
success in schools according to the overall level of their procedures 
and practices. Thus, interesting comparisons are possible between True 
Positive and False Positive schools to see if there are suggestions of 
determinants of success, and similarly False Negative and True Negative 
schools can be compared with the same objective. 

As before, however, care must be taken that external influences are 
not operating to explain group differences. Accordingly, the project 
staff computed the means by cell of the poverty measures, percent of low 
incomes and percent of free milk or lunch. Tables 6 and 7 report these 
means: 

Table 6 
Mean Percent Low Income 

                      Test Scores 
Processes         positive    negative 

positive            47%         39% 
negative            39%         36% 

Table 7 
Mean Percent Free Milk or Lunch 

                      Test Scores 
Processes         positive    negative 

positive            58%         50% 
negative            51%         47% 



Inspection of Tables 6 and 7 shows that in each row the successful schools 
had higher poverty indices than did the contrast schools, so test score 
differences cannot be explained by the usual model in which poverty is 
associated with low test performance. 

Tables 6 and 7 also show that the True Positive schools, those high on 
test scores and high on ratings, have the economically poorest students among 
the four types of schools. This difference is of course among a specially 
selected set of schools from cities and not among a representative set of 
schools, but the existence of the difference does suggest that some schools can 
and do reverse the trend for low performance to be associated with high 
poverty. 

Poverty differences do not help to characterize the schools in the four 
cells; other ways to characterize the schools are needed, and the study factors 
are a useful source of other ways. As an aid in reviewing factors, the ratings 
on the factors for the four True Positive schools are repeated in Table 8. 

Table 8 

Ratings of Factors for True Positive Schools 

                                              School Code 
Factor                                2       3       4       7 

A. Leadership                         4       [?]-    [?]     4+ 
B. Coordination                       4       [?]     3-      3- 
C. Additional Reading Personnel       2       3       [?]+    4 
D. Atmosphere                         4       4       4       2 
E. Individualization                  2+      2       3-      4- 
F. Evaluation                         3-      2       3       3 
G. Expectations                       3+      2+      3-      3 
H. Strong Emphasis                    4       4       3+      4+ 
I. Use of Phonics 
     Basal                            3.0     3.1     3.0     3.0 
     Supplementary                    3.1     3.2     3.7     4.2 
J. Training and Experience            3-      4-      3+      4- 
K. Quality of Teaching                4       3+      4       2 


Inspection of the ratings in Table 8 shows the four schools were 
not high on all the ratings, and that suggests that it is not necessary 
to be high on all factors to be a True Positive school. Thus, not all 
factors are necessary to achieve success. Further, there is not a large 
set of factors on which the four schools are uniformly high; the 
schools are uniformly high only on Strong Emphasis on Reading. 

There are three factors for which three of the four schools were 
judged to be high: Atmosphere; Training and Experience; and Quality of 
Teaching. The one school that was not high on Atmosphere is an open 
space school where there were low readings made of the purposeful and 
quiet scales. That same school was rated low on Quality of Teaching. 
Since that school did not rely solely upon traditional teaching techniques 
because it uses, for example, student contracts as part of its program, 
the low rating in Quality of Teaching may not be too critical. 

There are two factors for which three schools were judged to be 
low: Coordination and Individualization. Again it is the open school 
that is the exception on Individualization; the traditional schools are 
all low. 

Overall, these schools are positive, but there are substantial 
variations among the ratings assigned them and even among the anecdotal 
reports from the visiting teams. Reports about the classrooms varied 
from "rooms colorfully decorated with student work; pleasant, friendly, 
yet structured and controlled atmosphere; and the presence of special 
personnel to observe classroom work of students who are candidates for 
work in the Learning Center" to "generally barren classrooms; detached, 
preoccupied teachers; and emphasis on recall with no especially probing 
questions being posed." 

The different patterns of strengths and weaknesses among the successful 
schools suggest that these schools used different approaches to achieving 
success and that one model for success is not appropriate. 

For the opposite extreme, the True Negative schools, the ratings 
are repeated in Table 9. These are schools judged to have low performing 
students because the schools do not demonstrate strengths in what they 
do, and the judgment is borne out by low test scores. 

Table 9 

Ratings of Factors for True Negative Schools 

                                          School Code 
Factor                                13      16      18 

A. Leadership                         1+      3+      1 
B. Coordination                       2+      2-      1 
C. Additional Reading Personnel       [?]     [?]     [?] 
D. Atmosphere                         4       3-      3 
E. Individualization                  [?]     [?]     [?] 
F. Evaluation                         [?]     2       [?]- 
G. Expectations                       2       2       3 
H. Strong Emphasis                    3-      2+      3- 
I. Use of Phonics 
     Basal                            3.1     2.7     2.7 
     Supplementary                    3.8     3.6     3.7 
J. Training and Experience            3       1+      2 
K. Quality of Teaching                3       2       3+ 


The ratings of the True Negative schools are indeed low, but they are not 
always all low on the same factors. Typically one or two schools have 
ratings below 3, but on four dimensions all three schools have ratings 
below 3. These are Coordination, Individualization, Evaluation, and 
Strong Emphasis on Reading, factors in which the three schools share 
weaknesses. 

Among the other factors there are none for which all the True 
Negative schools show strength, but there is evidence to show that not 
all of these schools are totally lacking in good practices. The ratings 
in Atmosphere and the mean ratings in the Use of Phonics in supplementary 
materials, while rather mediocre, are not as low as are other ratings 
for this group. 

For the False Positive schools, those where good observations and 
judgments were made but good test scores were not, Table 10 summarizes 
factor ratings. 



Table 10 

Ratings of Factors for False Positive Schools 

                                                  School Code 
Factor                                11      14      15      19      20 

A. Leadership                         [?]     3       3+      2+      3+ 
B. Coordination                       4       2+      2       1+      4+ 
C. Additional Reading Personnel       4       4-      3+      3-      3 
D. Atmosphere                         [?]     3-      3+      [?]     [?] 
E. Individualization                  3-      3+      2+      2+      1+ 
F. Evaluation                         3+      3       3+      2       2+ 
G. Expectations                       2-      2+      2       2+      3+ 
H. Strong Emphasis                    4       4-      3+      3+      4+ 
I. Use of Phonics 
     Basal                            3.1     2.9     3.3     2.1     3.0 
     Supplementary                    2.9     3.8     3.7     3.6     3.9 
J. Training and Experience            4-      3+      3-      4       2+ 
K. Quality of Teaching                4-      4-      3+      4       3+ 


These five False Positive schools all place a Strong Emphasis on 
Reading and display high Quality of Teaching. Three of the five have 
high ratings in Atmosphere, Use of Additional Reading Personnel, Training 
and Experience, and Leadership. 

It is difficult to characterize these False Positive schools because 
there is evidence that they employ positive practices and yet their 
students do not perform well. It is difficult, also, to determine why 
the positive practices do not result in good 
student performance unless, perhaps, it takes time before the positive 
practices lead to good results. The hypothesis that time is needed is 
consistent with Weber's assertion that his successful schools required 
as many as nine years to achieve success. So, one possible characterization 
of the False Positive schools is that they are changing the effects upon 
pupil performance and are thus in transition. It is unfortunate that 
this study was not longitudinal so that changes could have been part of 
the study, but from the beginning this was a study of schools at a fixed 
moment. Until additional data over time are available it can only be 
conjectured that some of these schools have made some changes in processes 
but that the effects of these changes must await the time necessary 
before students can reflect those changes in their performance. 

The last cell labeled False Negative contains schools whose procedures 
do not fare well upon observation but whose students do well on tests in 
reading. Table 11 contains the factor ratings for these schools. 

These schools generally show low ratings, and all six are low on 
Coordination, Additional Reading Personnel, and Evaluation. Five of the 
six are low on Leadership, Individualization, and High Expectations, and 
all but two are rated as low in Atmosphere. The few high ratings that 
exist among these schools are dispersed over several factors, and so no 
single factor appears high in all these schools. 

Table 11 

Ratings of Factors for False Negative Schools 

                                                      School Code 
Factor                                1       5       6       8       9       10 

A. Leadership                         [?]     3       2       2       2       1 
B. Coordination                       2+      2-      3-      2       2       1[?] 
C. Additional Reading Personnel       3-      2+      2       3-      2       3- 
D. Atmosphere                         2+      2+      2+      4-      3+      2 
E. Individualization                  2       3-      2-      2       2-      3 
F. Evaluation                         2+      2       2       2+      3-      2+ 
G. Expectations                       [?]     2       [?]     1       2       2 
H. Strong Emphasis                    2+      2       3       3       3+      3 
I. Use of Phonics 
     Basal                            2.9     3.0     3.0     3.0     1.7     2.2 
     Supplementary                    2.8     3.4     3.6     3.2     3.6     3.7 
J. Training and Experience            2       2       4       3       3       2- 
K. Quality of Teaching                2-      3+      1+      2-      3       3+ 

The pattern of low ratings among these schools suggests that their 
students should not perform well on reading tests, but they do. There 
are a number of possible explanations for this strange difference between 
expectancy and actual results. There may be a lack of reliability and 
accuracy in the observations and ratings of the schools, but the use of 
many different specialists and the establishment of consensus among them 
makes this an unlikely explanation. Another possible explanation is 
that the factors rated have no bearing at all upon the results of testing 
and that something else is needed to explain the good results. The 
study staff reviewed the data collected on all the schools and found 
that these six schools had common attributes other than the low factor 
ratings. Common themes that prevailed in these schools were emphases 
upon discipline and drill. The observers found several instances in 
which the staffs looked upon the students as adversaries, demanding of 
them obedience and quiet. In the classrooms observed there were times 
when entire classes were occupied with worksheets, or teachers spent 
extensive time in drill and practice activities. 

The fact that these are drill and practice schools suggests more 
than the obvious conclusion that drill and practice in basic reading 
skills can result in students achieving well in tests of those same 
basic reading skills. Put another way, these schools teach the skills 
the tests measure. But these schools were judged to be weak on a number 
of factors that involve some good practices in schools, and the test 
scores do not reflect their weaknesses. This suggests the possibility 
that the tests themselves are inadequate to measure all the behaviors 
that make up reading. 

Just as Coleman and others were wrong to use measures of quantity 
to characterize the inputs of schools so is it wrong to use solely 
quantitative measures of school outputs. It is not new to say that 
there is more to reading than what reading tests measure, but the presence 
of the False Negative schools in this study does point up the fact that 
some schools probably fail to help their students in the qualitative 
aspects of reading. Since, however, there were no other measures of 
outcomes it is impossible to verify that these schools do have failings 
despite their good test scores. 

Comparisons Among the Four Types of Schools 

Beside the foregoing analysis of the characteristics of the schools 
in each of the four cells another analysis involves comparisons among 
the four classes of schools to identify how they differ and how they are 
alike. Table 12 provides a summary of the distribution of the ratings 
in each cell. When all the schools in a cell or all but one have factor 
ratings that are high (above 3), that factor is listed in the category 
"High" for the column that corresponds to the group. Similarly, when 
all or all but one of the factor ratings in a group are below 3, that 
factor is listed in the "Low" category. When the ratings are divided 
between high and low, the factors are listed under "Medium." In several 
instances the factor ratings are not evenly divided but show a tendency 
(for example, four out of six or three out of five) to be high or low, 
and the factors are listed as "Tending High" or "Tending Low." 
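The summary rule for Table 12 can be sketched as a small function. The text does not give an exact cutoff for a "tendency," so this sketch assumes that a simple majority short of "all but one" counts as tending; the function name summarize_factor and the sample values are hypothetical, and the inputs are ratings already converted to numbers.

```python
from typing import List


def summarize_factor(values: List[float], midpoint: float = 3.0) -> str:
    """Summarize one factor's ratings within one cell of schools.

    High / Low: all schools, or all but one, are above / below the midpoint.
    Tending High / Tending Low: a majority, but fewer than all but one
    (an assumption; the report gives only examples such as 4 of 6).
    Medium: anything else.
    """
    n = len(values)
    high = sum(1 for v in values if v > midpoint)
    low = sum(1 for v in values if v < midpoint)
    if high >= n - 1 and high > n / 2:
        return "High"
    if low >= n - 1 and low > n / 2:
        return "Low"
    if high > n / 2:
        return "Tending High"
    if low > n / 2:
        return "Tending Low"
    return "Medium"


# Hypothetical ratings for one factor in a cell of six schools.
print(summarize_factor([4, 4, 4, 4, 2, 2]))  # Tending High
```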

Since the Phonics ratings are based upon a different scale, the 
summary of those ratings was achieved differently. A school rating for 
Phonics was considered to be high if it was more than 0.1 above the 
median for all the schools and low if it was more than 0.1 below the 
median. In each category of schools the Phonics ratings are quite mixed 
and the summary places all Phonics-Supplemental ratings by category as 
medium and all but one category as medium for Phonics-Basal. Thus, the 
Phonics ratings do not serve to differentiate among the four categories 
of schools, and, in essence, the four categories are quite alike in the 
use of phonics. 
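The median rule for the Phonics ratings can be illustrated in the same way. The sketch below is an assumption-laden illustration: the function name phonics_levels and the school labels and ratings are hypothetical, and differences are rounded before comparison so that a rating exactly 0.1 from the median is not pushed over the line by floating-point error.

```python
from statistics import median
from typing import Dict


def phonics_levels(ratings: Dict[str, float], margin: float = 0.1) -> Dict[str, str]:
    """Label each school's Phonics rating relative to the median of all schools.

    'high' if more than `margin` above the median, 'low' if more than
    `margin` below it, and 'medium' otherwise.
    """
    mid = median(ratings.values())
    levels = {}
    for school, rating in ratings.items():
        diff = round(rating - mid, 9)  # guard against float noise
        if diff > margin:
            levels[school] = "high"
        elif diff < -margin:
            levels[school] = "low"
        else:
            levels[school] = "medium"
    return levels


# Hypothetical basal ratings for five schools.
sample = {"a": 3.0, "b": 3.1, "c": 3.3, "d": 2.7, "e": 3.0}
print(phonics_levels(sample))
```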

The four categories of schools are quite alike also in ratings of 
Coordination and of Individualization. In every instance the Individualization 
ratings are low, and in all but one case the Coordination ratings were 
low. It is interesting to note, therefore, that the four classes of 
schools and all the schools individually were judged to be low on Coordination 
and Individualization. Further, these two factors, as is the case with 
the Phonics factors, do not differentiate the classes of schools and can 
be ignored for purposes of contrasting the schools to see how they 
differ. 

The positive schools, true and false, naturally have higher ratings 
than do the negative schools by virtue of the process by which the 
positives and negatives were identified, but the positive schools differ 
more on some factors than on others. The ratings of the positive schools 
are uniformly high on Emphasis on Reading and show the greatest difference 
over negative schools, a difference of more than one on the five point 
scale of ratings. The positive schools are high on the Quality of 
Teaching factor and show almost a whole point difference in ratings over 
the negative schools. On Leadership the positive ratings were not all 
high, but because the ratings of the negative schools were so low the 
differences were more than one point on the five point scale. 

The true positive schools have high ratings on Atmosphere and the 
Training and Experience of staff while the false positive schools show 
only a tendency to be high on these factors. On the other hand, the 
false positive schools rate somewhat higher on Leadership and the use of 
Additional Reading Personnel. Thus, the false positive schools show 
some evidence--by Leadership, Strong Emphasis on Reading, Quality of 
Teaching, and Additional Reading Personnel--of good practices, but their 
ratings on Atmosphere and Training and Experience have yet to be high. These same 
schools also have low ratings of Expectation, lower than the ratings for 
true positive schools and, in fact, more like the ratings of the negative 
schools. Given a school functioning in ways that should help students 
to learn it is hard to understand why the Expectations are so low. 

There could be several explanations for the low Expectations of the 
false positive schools. First, there could be a circularity about 
Expectations such that a staff aware of low pupil performance consciously 
or unconsciously adjusts expectations to that low level. This circularity 
could explain why the true positive schools have higher Expectations: 
their students do better on tests. However, this explanation suggests 
that the false negative schools whose students do well on tests should 
have other than the low Expectations that they have, and yet another 
explanation is needed. Perhaps the staffs of the false negative schools 
believe their students are not good readers even if the test scores are 
good, and that possibility is consistent with the earlier assertion that 
there are qualitative aspects of reading that are not measured by standardized 
tests. 

A second possible explanation for the low Expectations among false 
positive schools is not unrelated to the first nor is it exclusive of 
it. If, as suggested earlier, it takes time for efforts made by a 
school to improve reading to result in good pupil performance in reading 
tests, then the circularity principle suggests that the Expectations are 
also in transition and will rise over time. Still it is difficult to 
understand why these schools do not yet show evidence of setting high 
standards for their students to achieve. 

Table 12 shows that the negative classes of schools have no ratings 
above mediocrity and have a large number of low ratings. Both classes 
of negative schools share low ratings on Expectation, Leadership, use of 
Additional Reading Personnel, and Evaluation. In addition, the true 
negative schools are lowest of all on Emphasis on Reading and Staff 
Training and Experience, but they do have slightly higher ratings on 
Atmosphere. With only these exceptions, the two classes of negative 
schools look quite alike--even more alike than do the two positive 
classes of schools. The true negative schools do indeed look negative, 
but it is hard to find an explanation for the false negative schools, which 
also look negative but whose students do well on tests of reading. The 
factor differences are in Training and Experience and in Emphasis on 
Reading, but it appears that the increased Emphasis on Reading took the 
form of drill and practice in those skills measured by standardized 
tests and in little else. Once again there is the suggestion that the 
false negative schools, while helping students to do well on standardized 
tests, do little to help students achieve qualitative outcomes in reading. 

Impact of the Four Types of Schools 

The analyses of the ratings of the schools in the four cells, first 
to characterize each cell and then to see how the four types of schools 
are alike or different, have helped to show how complex is the characterization 
of schools. It is not enough to characterize schools as merely good or 
bad nor is it enough to simply characterize schools as good or bad on a 
set of dimensions. Schools may be good or bad on test scores, but test 
scores do not adequately represent all the outcomes of a school, and 
other outcome measures are critical. Such measures, while not available 
to this study--or even to many other studies for that matter--involve 
the qualitative aspects of reading. Some schools may be good or bad on 
quantitative measures--standardized tests--quite independent of how they 
do on qualitative measures of outcomes. Similarly schools differ on 
their inputs to the learning process and may differ on qualitative and 
quantitative measures of inputs. Thus is the characterization of schools 
quite complex, perhaps even more complex than the four cells described 
here. 

The four cells, however, serve to describe two dichotomies, one for 
how well a school is following good practices and one for its outcomes. 
One cell represents good outcomes and good practices; another represents 
bad outcomes with the absence of good practices. The other cells represent 
schools that somehow do not match outcome to expectation: the false 
positive school falls below expectation; the false negative school 
exceeds its expectation. 

The four cells become useful not only for characterizing the different 
relationships between outcome and expectation, but they help to provide 
two important inferences concerning the schools whose outcomes do not 
match expectation. The false positive schools whose outcomes fall short 
of expectation appear to be schools in transition. Of course, their 
transition can logically be in either direction: they may be on their 
way to achieving successful outcomes; or they may be dropping off in 
some or all of their good practices. Further study of such schools over 
time is necessary to establish the direction of change, and such a study 
will also help to pinpoint how different strategies for change operate. 
Since true positive schools have different patterns of excellence in the 
study factors, it is likely that their success was achieved by different 
strategies. A study of schools in transition can help to identify how 
different strategies for different schools can result in movement toward 
or away from excellence. 

The other set of unusual schools, the false negative schools, have 
good test scores, better even than would be expected from inspection of 
their processes. Their narrow dedication to drill and practice does 
seem to result in good performance on reading tests, but the natural 
question is whether these schools would fare as well on outcome measures 
that go beyond elementary skill levels to assess reading behavior, 
learning by reading, and the like. This suggests further that were 
those outcome measures available for those schools they would no longer 
appear to be good on all outcomes. But to say they may not be entirely 
good requires a judgment about what outcomes or objectives should be met 
by a school. Some schools may in fact aspire to teach only the basic 
skills as measured by standardized tests and not try to impart the more 
qualitative aspects of reading. For those schools good test scores may 
be enough. Other schools may aspire to more than what standardized 
tests measure, but such schools must be able to describe what they hope 
to achieve in terms sufficiently behavioral that measurements can be 
made by having students display those behaviors. For those schools new 
tests and measures are needed that differ from existing standardized 
tests, and that, of course, is what criterion-referenced tests are for. 
Preferred outcomes can differ from school to school and so can the 
measurement of those outcomes. 

Though schools may differ in their aspirations it seems reasonable 
to assume that the true positive schools represent the richest form of 
goodness of the schools in this study, but even they are not high on all 
factors, a fact which shows that they could be improved. For example, they 
have low ratings on Coordination and Individualization, which show room 
for improvement. There are four factors (Leadership, Expectations, 
Additional Reading Personnel, and Evaluation) on which the true positive 
schools are given mixed or, on the average, medium ratings. Still, these 
four factors give an interesting suggestion about the dynamics of some school 
factors because, while they appear inconclusive (as if they are not essential 
to a school's achieving success), the ratings of the true positive schools 
are still higher than those of the negative schools. These same four 
factors account for all of the low factors of the false negative schools 
except for the Coordination and Individualization factors, which do not 
differentiate among the four classes of schools. And all but two of the low 
factors among the true negative schools are similarly accounted for. Even though 
the ratings of the four factors for the true positive schools are medium and 
inconclusive, they are still much better than the low ratings for the negative 
schools, and differences between magnitudes become important quite apart from 
the magnitudes themselves. 

Looking at the magnitudes of differences has been an approach used 
in other studies, such as the study by the New York State Department of 
Education (1974) in which a successful school and an unsuccessful school 
were compared. That study concluded that leadership, atmosphere, and 
emphasis on reading were critical factors, factors that to varying 
degrees are supported by this study. Other factors reported in the 
Weber study (additional reading personnel, evaluation, and expectation) 
also receive some support from this study. This study does suggest, 
moreover, the importance of quality of teaching (which was not deemed 
essential in other studies) and of staff training and experience, both 
factors related to the teacher in the classroom. 

It may not be important, however, to note areas of agreement and of 
disagreement nor to try to draw inferences from those areas because this 
study has shown that successful schools differ among themselves in their 
patterns of factors. Thus, it may not be appropriate to search for 
unique factors or patterns of factors that separate good and bad schools 
but instead to concentrate on the fact that differences between good and 
bad schools do exist and are discernible. Just as it is recognized that 
schools differ and that schools differ in how they achieve success, the 
critical concern should be with the process by which a school finds out 
what kind of a school it is and then develops strategies for improvement 
in the teaching of reading. 

Implications for Further Work 

A school could determine what kind of a school it is by using 
directly the approach used in this study. It could enlist outside 
observers who would visit the school, use the instruments developed for 
this study, develop the ratings of factors by consensus, study outcome 
measures, and finally identify which of the four cells the school falls 
into. There are a number of reasons why this approach is not appropriate, 
however. First, it is not necessarily best to enlist outside observers 
in the process because then the school and its staff become passive 
actors to be observed precisely when it is best to involve them in the 
process of finding, ultimately, how to achieve improvement. Thus, the 
staff can be directly involved in collecting some of the data, particularly 
those which are statistical. The clinical data often require a type of 
objectivity that may be difficult to expect from self-inspection, but 
those data can be collected by peers from other schools by means of some 
collaborative efforts. Schools, particularly those in a given district, 
now collaborate in workshops and meetings in which ideas are shared. 
Even across district lines schools now collaborate, for example, to 
judge each other for accreditation by the New England Association of 
Schools and Colleges, but that collaboration is quite formal and potentially 
punitive. A form of collaboration somewhere between the informal district 
meeting of teachers and the accreditation sessions is needed. Such a 
collaboration would involve several schools helping to observe each 
other and would involve the collective staffs in a form of introspection 
and concern for what they are doing that in and of itself is good and 
healthy. 

Second, it is not appropriate to use directly the instruments from 
this study because the instruments should be made more sensitive to the 
four types of schools. Further, the instruments should be expanded to 
deal, as we shall see later, with the collection of additional data 
about strategies and outcomes and, in addition, the staffs should be 
trained in how to use the instruments. 

Just as with the use of peers to make observations and collect 
certain data, it is desirable to use peers to develop the ratings of 
factors by consensus rather than relying solely upon outsiders. Still, 
it will be important to have the rating teams trained and practiced in 
the consensus process. 

To study outcome measures as they were in this study (and in most 
other studies as well) by relying upon standardized tests may be inadequate. 
The schools themselves must determine what outcome measures are important 
by first deciding what objectives they have for reading performance and 
expressing these in behavioral terms that allow for measurement. For 
some of such objectives standardized reading tests may be appropriate, 
and for others some existing instruments that are not standardized may 
be appropriate. Examples of such instruments are those used in the 
national assessment program of the Education Commission of the States or 
the instruments already in assessment use in the Commonwealth by the 
Massachusetts Department of Education. These existing instruments would 
be very useful for some outcome measures and not for others, and other 
techniques of measurement may be necessary. Some such techniques may 
include unobtrusive measurements of reading behavior, for example the 
use of books in the library. Care must be taken that, when appropriate, 
such techniques reveal qualitative behaviors and not just quantitative. 
Thus, it is important to go beyond just counting how many books are 
checked out of a library. But this introduces additional decisions by 
the schools because it is not sufficient to make judgments about difficulty 
levels of books taken out or the topics covered in the books taken out, 
because individual interests vary so much that given books have different 
effects upon different people. It is better that the schools search for 
evidence that student reading shows the transfer of reading 
skills to other forms of learning, to language development, or even to 
simple enjoyment. 

When a group of schools has finally determined how each of the 
schools stands in terms of its practices and in terms of its outcome 
measures, it is then important to discover what the effects are of 
different strategies for change. To accomplish this task the schools 
then become laboratories and the school staffs become the researchers 
who will follow the schools over time, particularly those schools identified 
as in transition. The continuation over time of the observing of the 
schools will allow the researchers (the school staffs acting collaboratively) 
to investigate what emphases and what changes the schools are making. 
Instruments can help to find out what strategies the schools believe 
they are following. Observations over time will substantiate whether or 
not they are doing as they say they are doing or identify any changes of 
which there is no direct awareness. 

When the schools succeed in using themselves as laboratories to 
find over time what kinds of schools they are and what kinds of schools 
they are becoming by virtue of different strategies, then the schools 
become better able to define for themselves strategies appropriate for 
them to achieve the particular success they seek. This form of action 
research seems most appropriate for schools to use to achieve success 
for several reasons. First, schools may appropriately define for themselves 
different definitions of success; no one definition should be imposed 
externally. Second, schools may elect different strategies to achieve 
success; no strategies should be imposed externally. Finally, the 
possibility that schools become their own agents for change and improvement 
is not only consistent with the policy of local control of schools but 
it places the control in the schools where the capacity to determine the 
most meaningful strategies for success exists. 



NOTES 

Note 1 

In the tenth city there was soon to be a new superintendent, 
and the incumbent preferred not to make a commitment to 
participation in the study for his successor. 

Note 2 

The contrast schools were not the schools with the lowest test 
scores since such schools did not always match on poverty or 
other measures. 

Note 3 

The instruments used in this study are not reproduced in 
this report because the findings of this study result in the 
recommendation, to be noted later, that the instruments should 
be refined to account for additional data, and the instruments 
should therefore not be used directly. Copies of the instruments 
are, however, available upon request made to Educational Research 
Corporation, 85 Main Street, Watertown, Massachusetts 02172. 

Note 4 

It was not necessary to use 5+ or 1- because ratings of 5 and 1 
already expressed a conviction of certainty. 